Internet data packet transport: from global topology to local queueing dynamics 
We study the structural features and evolution of the Internet at the
autonomous systems level. Extracting relevant parameters for the growth
dynamics of the Internet topology, we construct a toy model for the
Internet evolution, which includes the ingredients of multiplicative
stochastic evolution of nodes and edges and adaptive rewiring of edges.
The model successfully reproduces structural features of the Internet at
a fundamental level. We also introduce a quantity called the load as the
capacity of a node needed for handling the communication traffic and
study its time-dependent behavior at the hubs across years. The load at
the hub increases with network size $N$ as $\sim N^{1.8}$. Finally, we
study data packet traffic on the microscopic scale. The average delay
time of data packets in a queueing system is calculated, in particular,
when the number of arrival channels is scale-free. We show that when the
number of arriving data packets follows a power-law distribution,
$\sim n^{-\lambda}$, the queue length distribution decays as
$n^{1-\lambda}$ and the average delay time at the hub diverges as
$\sim N^{(3-\lambda)/(\gamma-1)}$ in the $N \to \infty$ limit when
$2 < \lambda < 3$, $\gamma$ being the network degree exponent.

PACS numbers: 89.75.Hc, 89.70.+c, 89.75.Da



In recent years, the Internet has become one of the most influential
media in our daily life, going beyond its role as the basic
infrastructure of this technological world. Explosive growth in the
number of users, and hence in the amount of traffic, poses a number of
problems that are not only important in practice, e.g., for keeping the
network free from undesired congestion and malfunctioning, but also of
theoretical interest as an interdisciplinary topic [?]. Such interests,
also stimulated by other disciplines such as biology, sociology, and
statistical physics, have blossomed into the broader framework of
network science [?]. In this Letter, we first briefly review previous
studies of the Internet topology and of data packet transport on the
global scale, and then study the delivery process in the queueing system
of each node embedded in the Internet.

The Internet is a primary example of complex networks. It consists of a
large number of very heterogeneous units interconnected with various
connection bandwidths; however, it is neither regular nor completely
random. In their landmark paper, Faloutsos et al. [?] showed that the
Internet at the autonomous systems (ASes) level is a scale-free (SF)
network [?], meaning that the degree $k$, the number of connections a
node has, follows a power-law distribution,

    $P_d(k) \sim k^{-\gamma}$. (1)

The degree exponent $\gamma$ was subsequently measured and confirmed in
a number of studies to be $\gamma \approx 2.1(1)$. The power-law degree
distribution implies the presence of a few nodes having a large number
of connections, called hubs, while most other nodes have only a few
connections.

It is known that the degrees of the two nodes located at the ends of a
link are correlated with each other. As a first step, the degree-degree
correlation can be quantified in terms of the mean degree of the
neighbors of a node with degree $k$ as a function of $k$, denoted by
$\langle k_{nn}\rangle(k)$ [?], which follows another power law,

    $\langle k_{nn}\rangle(k) \sim k^{-\nu}$. (2)

For the Internet, it decays with $\nu \approx 0.5$, as measured from
real-world Internet data [?].
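As an illustration of how $\langle k_{nn}\rangle(k)$ in Eq. (2) can be measured on a given graph, the following minimal sketch averages the mean neighbor degree over all nodes of equal degree. The function name and the adjacency-dictionary representation are assumptions made for this example, not part of the original analysis.

```python
from collections import defaultdict

def mean_neighbor_degree_by_k(adj):
    """Return {k: <k_nn>(k)} for an undirected graph {node: set(neighbors)}."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, nbrs in adj.items():
        k = degree[v]
        if k == 0:
            continue  # isolated nodes have no neighbors to average over
        sums[k] += sum(degree[u] for u in nbrs) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}

# Toy example (hypothetical data): a star with one hub and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
knn = mean_neighbor_degree_by_k(star)
```

In the star, the hub (degree 4) sees only degree-1 neighbors while each leaf sees the hub, so $\langle k_{nn}\rangle(k)$ decreases with $k$, the same qualitative trend as in Eq. (2).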

The Internet has modules within it. Such modular structures arise due to
regional control systems, and often form in a hierarchical way [11].
Recently, it was argued that such modular and hierarchical structures
can be described in terms of the clustering coefficient. Let $C_i$ be
the local clustering coefficient of a node $i$, defined as
$C_i = 2e_i/[k_i(k_i-1)]$, where $e_i$ is the number of links present
among the neighbors of node $i$, out of its maximum possible number
$k_i(k_i-1)/2$. The clustering coefficient of a network, $C$, is the
average of $C_i$ over all nodes. $C(k)$ denotes the clustering function
of a node with degree $k$, i.e., $C_i$ averaged over nodes with degree
$k$. When a network is modular and hierarchical, the clustering function
follows a power law, $C(k) \sim k^{-\beta}$ for large $k$, and $C$ is
independent of system size $N$ [?]. For the Internet, it was measured
that the clustering coefficient is $C_{\rm AS} \approx 0.25$ and the
exponent is $\beta \approx 0.75$ [14].
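The definitions of $C_i$, $C$, and $C(k)$ above can be sketched in a few lines. This is a minimal illustration; the function names are hypothetical, and setting $C_i = 0$ for nodes with fewer than two neighbors is an assumed convention that the text does not specify.

```python
from collections import defaultdict
from itertools import combinations

def local_clustering(adj):
    """C_i = 2 e_i / [k_i (k_i - 1)] for an undirected graph {node: set(neighbors)}."""
    c = {}
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            c[i] = 0.0  # assumed convention for k_i < 2
            continue
        # e_i: number of links present among the neighbors of i
        e = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        c[i] = 2.0 * e / (k * (k - 1))
    return c

def clustering_function(adj):
    """C(k): C_i averaged over all nodes of degree k."""
    groups = defaultdict(list)
    for i, ci in local_clustering(adj).items():
        groups[len(adj[i])].append(ci)
    return {k: sum(vals) / len(vals) for k, vals in groups.items()}

# Toy example (hypothetical data): a triangle 0-1-2 with a pendant node 3 on node 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
c_local = local_clustering(graph)
c_mean = sum(c_local.values()) / len(c_local)
c_of_k = clustering_function(graph)
```

Here node 0 has three neighbors with one link among them, so $C_0 = 1/3$, while nodes 1 and 2 have $C_i = 1$; the network average is $C = 7/12$.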

There are many known models that mimic the Internet topology. Here we
introduce our stochastic model, which evolves through the following four
rules. This model is based on the model proposed by Huberman and
Adamic [15],
FIG. 1: The adaptive rewiring rule. A node (white) detaches one of its
links from a node (green or gray) of degree 2 in (a), and attaches it to
one of the nodes (green or gray) with degree 3, larger than the degree
of the detached node, in (b).



FIG. 2: The load at each node due to a unit packet transfer from the
node $s$ to the node $t$, $\ell_i^{s\to t}$. In this diagram, only the
nodes along the shortest paths between $(s, t)$ are shown. The quantity
in parentheses is the corresponding value of the load due to the packet
from $t$ to $s$, $\ell_i^{t\to s}$.



which is a generic model that reproduces an uncorrelated SF network, and
we modify it by adding the adaptation rule [16], which generates the
degree-degree correlations. The rules are as follows: (i) Geometrical
growth: At time step $t$, a geometrically increasing number of new
nodes, $\alpha N(t-1)$, is introduced into the system, with the
empirical value $\alpha \approx 0.029$. Then, following the empirical
fact $\langle k_{\rm new}\rangle \approx 1.34$, each newly added node
connects to one or two existing nodes according to the preferential
attachment (PA) rule. (ii) Accelerated growth: Each existing node
increases its degree by a factor with the empirical value
$\approx 0.035$. These new internal links are also connected following
the PA rule. (iii) Fluctuations: Each node disconnects existing links
randomly or connects new links following the PA rule, with equal
probability. The variance of this noise is $\approx (0.14)^2$, as
measured from empirical data. (iv) Adaptation: When connecting in step
(iii), the PA rule is applied only within the subset of existing nodes
having a larger degree than the one previously disconnected. This last
constraint accounts for the adaptation process. The adaptive rewiring
rule is depicted in Fig. 1.
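To make rule (iv) concrete, the following minimal sketch performs a single adaptive rewiring move on an adjacency dictionary. It is not the authors' implementation: the helper name `adaptive_rewire` is hypothetical, the degree of the detached node is assumed to be measured before the link is removed, and nodes already linked to the rewiring node are assumed to be excluded as targets.

```python
import random

def adaptive_rewire(adj, node, rng):
    """One adaptive rewiring move for `node` on {node: set(neighbors)}.

    Detach a randomly chosen link node-old, then reattach it by preferential
    attachment restricted to nodes whose degree exceeds that of `old`.
    Returns (old, new), or None if no admissible move exists.
    """
    if not adj[node]:
        return None
    old = rng.choice(sorted(adj[node]))
    k_old = len(adj[old])  # assumption: degree measured before detaching
    candidates = [v for v in adj
                  if v != node and v not in adj[node] and len(adj[v]) > k_old]
    if not candidates:
        return None
    # PA rule within the admissible subset: weight proportional to degree
    weights = [len(adj[v]) for v in candidates]
    new = rng.choices(candidates, weights=weights, k=1)[0]
    adj[node].discard(old)
    adj[old].discard(node)
    adj[node].add(new)
    adj[new].add(node)
    return old, new

# Toy example mirroring Fig. 1: node 0 leaves a degree-2 node for a degree-3 node.
net = {0: {1}, 1: {0, 2}, 2: {1, 3, 4}, 3: {2}, 4: {2}}
move = adaptive_rewire(net, 0, random.Random(0))
```

In this toy network the only admissible target is node 2, so the move is deterministic: the link 0-1 is replaced by 0-2, and the total number of links is unchanged.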

Through this adaptation model, we can successfully reproduce generic
features of the Internet topology, as follows. First, the degree
exponent is measured to be $\gamma_{\rm model} \approx 2.2$, close to
the empirical result $\gamma_{\rm AS} \approx 2.1(1)$. Second, the
clustering coefficient is measured to be $C_{\rm model} \approx 0.15(7)$,
comparable to the empirical value $C_{\rm AS} \approx 0.25$. Note that
without the adaptation rule, we only get $C \approx 0.01(1)$. The
clustering function $C(k)$ also behaves similarly to that of the
real-world Internet, specifically, decaying roughly as a power law with
$\beta \approx 1.1(3)$ for large $k$ [18], but the overall curve shifts
upward and a constant behavior appears for small $k$. Third, the mean
degree function $\langle k_{nn}\rangle(k)$ also behaves similarly to
that of the real-world Internet, but it also shifts upward overall. In
short, the behaviors of $C(k)$ and $\langle k_{nn}\rangle(k)$ of the
adaptation model are close to those of the real Internet AS map, but
with the discrepancies described above. On the other hand, another toy
model [19] has recently been introduced to represent the evolution of
the Internet topology. That model is similar to ours in that it includes
the multiplicative stochastic evolution of nodes and edges as well as
adaptive rewiring of edges. However, its rewiring dynamics incorporates
the user population instead of the node degree used here.

Next, we study the transport of data packets on the Internet. Data
packets are sent and received over it constantly, causing momentary
local congestion from time to time. To avoid such undesired congestion,
the capacity, or the bandwidth, of the routers should be large enough to
handle the traffic. First we introduce a rough measure of such capacity,
called the load and denoted as $\ell$ [23]. One assumes that every node
sends a unit packet to every other node in unit time, and that the
packets are transferred from the source to the target only along the
shortest paths between them and are divided evenly upon encountering any
branching point. To be precise, let $\ell_i^{s\to t}$ be the amount of
the packet sent from $s$ (source) to $t$ (target) that passes through
the node $i$ (see Fig. 2). Then the load of a node $i$, $\ell_i$, is the
accumulated sum of $\ell_i^{s\to t}$ for all $s$ and $t$,
$\ell_i = \sum_{s\neq t} \ell_i^{s\to t}$. In other words, the load of a
node $i$ tells us how large the capacity of the node should be in order
to maintain the whole network in a free-flow state. However, due to
local fluctuations in the concentration of data packets, the traffic
could be congested even when the capacity of each node is set equal to
its load. The distribution of the load reflects the high level of
heterogeneity of the Internet: It also follows a power law,

    $P_\ell(\ell) \sim \ell^{-\delta}$, (3)

with the load exponent $\delta \approx 2.0$ for the Internet. For
comparison, the quantity "load" differs from the "betweenness
centrality" in its definition. In the load, when a unit packet
encounters a branching point along the shortest pathways, it is divided
evenly using the local information of the branching number, while in
the betweenness centrality, it can be divided unevenly using the global
information of the total number of shortest pathways between a given
source and target. Despite such a
FIG. 3: Time evolution of the load versus $N(t)$ at the ASes of
degree-rank 1 (○), 2 (□), 3 (◇), 4 (△), and 5 (×). The dashed line for
larger $N$ has slope 1.8, drawn as a guide to the eye.



difference, we find no appreciable difference in practice between the
numerical values of the load and the betweenness centrality for a given
network.
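The even-splitting rule in the definition of the load can be sketched as follows. This is a minimal illustration for small undirected graphs, not an efficient algorithm; the function names are hypothetical, and counting the source and target themselves as carrying the full unit packet is an assumption of this sketch.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def pair_load(adj, dist, s, t):
    """Share of one unit packet s -> t seen by each node on the shortest paths."""
    d_st = dist[s][t]
    through = {}
    share = {s: 1.0}
    for _ in range(d_st + 1):  # one layer per hop, from s up to t
        nxt = {}
        for v, amount in share.items():
            through[v] = amount
            # neighbors one hop further from s that still lie on a shortest path to t
            succ = [u for u in adj[v]
                    if dist[s][u] == dist[s][v] + 1
                    and dist[s][u] + dist[u][t] == d_st]
            for u in succ:
                nxt[u] = nxt.get(u, 0.0) + amount / len(succ)  # even split
        share = nxt
    return through

def load(adj):
    """Load of each node: sum of pair_load over all ordered pairs s != t."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    ell = {v: 0.0 for v in adj}
    for s in adj:
        for t in dist[s]:
            if t != s:
                for v, amount in pair_load(adj, dist, s, t).items():
                    ell[v] += amount
    return ell

# Toy example (hypothetical data): node 1 leads to two branches, node 2 to one.
fork = {0: {1, 2}, 1: {0, 3, 4}, 2: {0, 5}, 3: {1, 6}, 4: {1, 6},
        5: {2, 6}, 6: {3, 4, 5}}
dist = {v: bfs_distances(fork, v) for v in fork}
forward = pair_load(fork, dist, 0, 6)
backward = pair_load(fork, dist, 6, 0)
path_load = load({0: {1}, 1: {0, 2}, 2: {1}})
```

In the `fork` graph, the packet from node 0 to node 6 splits evenly, so nodes 1 and 2 each carry 1/2, whereas a betweenness-type count over the three shortest paths would assign 2/3 to node 1 and 1/3 to node 2. The reverse packet from 6 to 0 gives node 1 a share of 2/3, which illustrates why the two directions are listed separately in Fig. 2.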

The load of a node is highly correlated with its degree. This suggests a
scaling relation between the load and the degree of a node,
$\ell \sim k^{\eta}$, and the scaling exponent $\eta$ is estimated as
$\eta = 1.06 \pm 0.03$ for the January 2000 AS map [?, 23]. In fact, if
one assumes that the ranks of each node for the degree and the load are
the same, then one can show that the exponent $\eta$ depends on $\gamma$
and $\delta$ as $\eta = (\gamma-1)/(\delta-1)$. With
$\gamma \approx 2.1$ and $\delta \approx 2.0$, we have
$\eta \approx 1.1$, which is consistent with the direct measurement.
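The rank argument can be spelled out as follows. This is a sketch assuming pure power-law tails in Eqs. (1) and (3); the symbol $R$ for the common rank of a node is introduced here for illustration and does not appear in the original.

```latex
% Rank of a node = number of nodes with larger degree (or larger load)
R(k)    \simeq N \int_{k}^{\infty} P_d(k')\,dk'       \sim N\,k^{-(\gamma-1)}, \qquad
R(\ell) \simeq N \int_{\ell}^{\infty} P_\ell(\ell')\,d\ell' \sim N\,\ell^{-(\delta-1)} .
% Equal ranks for degree and load:
k^{-(\gamma-1)} \sim \ell^{-(\delta-1)}
\;\Longrightarrow\;
\ell \sim k^{(\gamma-1)/(\delta-1)}, \qquad
\eta = \frac{\gamma-1}{\delta-1} \approx \frac{2.1-1}{2.0-1} = 1.1 .
```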

The time evolution of the load at each AS is also of interest.
Practically, how the load scales with the total number of ASes (the size
of the AS map) is important information for network management. In
Fig. 3, we show $\ell_i(t)$ versus $N(t)$ for the 5 ASes with the
highest rank in degree, i.e., the 5 ASes that have the largest degrees
at $t = 0$. The data of $\{\ell_i(t)\}$ show large fluctuations in time.
Interestingly, the fluctuation is moderate for the hub, implying that
the connections of the hub are rather stable. The load at the hub is
found to scale with $N(t)$ as $\ell_h(t) \sim N(t)^{\mu}$, but the
scaling shows a crossover from $\mu \approx 2.4$ to $\mu \approx 1.8$
around $t \approx 14$.

Internet traffic along the shortest pathways causes undesirable queue
congestion at hubs in SF networks. Many alternative routing strategies
have been introduced to reduce the load at the hub and to raise the
critical packet density at which the transition from the free-flow to
the congested state occurs [24, 25, 26, 27, 28, 29, 30].

Transport of data packets also relies on the queueing process of an
individual AS. Here we extend existing queueing theory [31] to the case
where there are multiple arrival channels, in particular, when their
number distribution follows a power law, aiming at understanding the
transport in SF networks. For simplicity, we assume that the arrival and
processing rates of an individual channel are the same, and that they
are independent of the degree of a given AS. Time is discretized and the
unit time is given as the inverse of the rate.
FIG. 4: (a) Snapshot of the buffer with arriving packets. Each row
represents a communication channel, and the circles therein are the
sequence of incoming packets. The integers on the horizontal axis
indicate the arrival time step of each packet. Open circles stand for
packets that are not delayed. Delayed packets are represented by three
kinds of filled circles according to their delay mechanism. See text for
details. The resulting delivery sequence is shown in (b) with the
processing time step.



Delay of packet delivery in our queueing process originates from two
sources. First, owing to multiple arrival channels, multiple packets can
arrive at a given queueing system in a unit time interval and accumulate
in the buffer. For example, the grey circles in Fig. 4 represent such a
case. This type of delay is referred to as delay type 1 (DT1) below.
Second, the delay is caused by preceding packets in the buffer, which
can happen under the first-in-first-out rule. The hatched circles in
Fig. 4 demonstrate this case, which is referred to as delay type 2
(DT2). Any delay can then be decomposed into the two types. The black
circle in Fig. 4 is a packet delayed by both DT1 and DT2. We calculate
the average delay time for each type separately, and combine them next.

To proceed, we first define $p_n$ as the probability that $n$ packets
arrive at a given queueing system at the same time. For the DT1 case, if
$q_m$ denotes the probability that a packet is delayed $m$ time steps,
we find

    $q_m = \sum_{n=m+1}^{\infty} \frac{p_n}{n} + p_0\,\delta_{m,0}$, (4)

where $\delta_{i,j}$ is the Kronecker delta function. Then, the average
of delay time steps through the DT1 process is obtained as

    $\langle m\rangle_q = \sum_{n=2}^{\infty} \sum_{m=1}^{n-1} m\,\frac{p_n}{n} = \frac{\langle n\rangle_p - 1 + p_0}{2}$, (5)

where $\langle\cdots\rangle_q$ ($\langle\cdots\rangle_p$) is the average
with respect to the probability $q_m$ ($p_n$).
For the DT2 case, we introduce $r_b(t)$ as the probability that a packet
arriving at time $t$ is delayed $b$ time steps by preceding delayed
packets. In the steady state, we obtain

    $r_{b'} = \sum_{b=0}^{b'+1} r_b\, p_{b'-b+1} + p_0 r_0\,\delta_{0,b'}$. (6)

By using the generating functions
$\mathcal{R}(z) = \sum_{b=0}^{\infty} r_b z^b$ and
$\mathcal{P}(z) = \sum_{n=0}^{\infty} p_n z^n$, we obtain

    $\mathcal{R}(z)\,[z - \mathcal{P}(z)] = p_0 r_0 (z - 1)$, (7)

with $p_0 r_0 = 1 - \langle n\rangle_p$.
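For a finite arrival distribution, the steady-state relation (6) can be solved forward in $b'$, starting from $r_0 = (1 - \langle n\rangle_p)/p_0$. The sketch below is one possible way to do this, not the authors' procedure; the function name and the example distribution are assumptions, and it requires $p_0 > 0$ and $\langle n\rangle_p < 1$.

```python
def dt2_distribution(p, b_max):
    """Solve Eq. (6) forward for r[0..b_max], given finite p with p[0] > 0, <n> < 1."""
    mean_n = sum(n * pn for n, pn in enumerate(p))
    if p[0] <= 0 or mean_n >= 1:
        raise ValueError("need p[0] > 0 and <n> < 1 for a steady state")

    def p_at(n):
        return p[n] if 0 <= n < len(p) else 0.0

    r = [(1.0 - mean_n) / p[0]]  # from p_0 r_0 = 1 - <n>_p
    for bp in range(b_max):
        # Eq. (6) rearranged for the b = b' + 1 term, which multiplies p_0
        rhs = r[bp] - sum(r[b] * p_at(bp - b + 1) for b in range(bp + 1))
        if bp == 0:
            rhs -= p[0] * r[0]  # the Kronecker-delta term
        r.append(rhs / p[0])
    return r

# Hypothetical example: 0, 1 or 2 packets arrive per time step.
r_example = dt2_distribution([0.5, 0.3, 0.2], 60)
mean_b = sum(b * rb for b, rb in enumerate(r_example))
```

With at most two arrivals per step the buffer is a simple birth-death chain, so this example has the geometric solution $r_b = 0.6 \times 0.4^b$ with mean $2/3$, which the recursion reproduces.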

The next step is to combine the two types of delays. To this end, we
define $w_\tau$ as the probability that a unit packet is delayed by
$\tau$. Then $w_\tau = \sum_{m=0}^{\tau} q_m r_{\tau-m}$, since DT1 and
DT2 are statistically independent. From this, the average delay time is
obtained as

    $\langle\tau\rangle_w = \sum_{\tau=0}^{\infty} \tau\, w_\tau = \frac{\langle n\rangle_p - 1 + p_0}{2} + \frac{\langle n^2\rangle_p - \langle n\rangle_p}{2(1 - \langle n\rangle_p)}$. (8)

Thus, a critical congestion occurs when $\langle n\rangle_p = 1$, at
which the delay time diverges. The singular behavior of the form
$(1 - \langle n\rangle_p)^{-1}$ was observed numerically in the study of
directed traffic flow in Euclidean space [32].
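Eq. (8) can be compared against a direct simulation of the buffer. The sketch below is a hedged illustration, not the authors' code: it assumes one packet is processed per time step, and it follows the weighting implied by Eq. (4), in which each time step contributes one sample and a step without arrivals counts as zero DT1 delay. The function names, the seed, and the example distribution are choices made for this example.

```python
import random

def predicted_mean_delay(p):
    """Right-hand side of Eq. (8) for a finite arrival distribution p[n]."""
    mean_n = sum(n * pn for n, pn in enumerate(p))
    mean_n2 = sum(n * n * pn for n, pn in enumerate(p))
    return (mean_n - 1 + p[0]) / 2 + (mean_n2 - mean_n) / (2 * (1 - mean_n))

def simulate_mean_delay(p, steps, seed=0):
    """Monte Carlo estimate of <tau>_w for a first-in-first-out buffer."""
    rng = random.Random(seed)
    counts = list(range(len(p)))
    backlog = 0  # packets left over from earlier steps (source of DT2)
    total = 0
    for _ in range(steps):
        n = rng.choices(counts, weights=p, k=1)[0]
        # tagged packet's position within its own batch (source of DT1)
        m = rng.randrange(n) if n > 0 else 0
        total += backlog + m
        backlog = max(backlog + n - 1, 0)  # one packet is processed per step
    return total / steps

p_example = [0.5, 0.3, 0.2]
theory = predicted_mean_delay(p_example)
estimate = simulate_mean_delay(p_example, 200000)
```

For this example the prediction is $0.1 + 2/3 \approx 0.767$, and the simulated average should agree with it up to statistical error.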

We now consider the case where the number of arriving data packets
follows a power law, $p_n \sim n^{-\lambda}$. In fact, non-uniformity of
the number of data packets arriving at a given node gives rise to
self-similar patterns, as is well known in computer science [33]. A
precise value of the exponent $\lambda$ has not been reported yet.
Moreover, it is not known whether the exponent is universal, i.e.,
independent of bandwidths or degrees in the SF network. The relation of
$\lambda$ to the load exponent $\delta$, if there is any, is not known
either.

If $\lambda < 3$, $\langle n^2\rangle_p$ diverges. For such a power-law
distribution, its generating function $\mathcal{P}(z)$ develops a
singular part and takes the form, when $2 < \lambda < 3$,

    $\mathcal{P}(z) = 1 - \langle n\rangle_p (1-z) + a(1-z)^{\lambda-1} + O((1-z)^2)$, (9)

where $a$ is a constant. By using the relation between $\mathcal{P}(z)$
and $\mathcal{R}(z)$ from Eq. (7), we obtain

    $\mathcal{R}(z) = 1 - \frac{a}{1 - \langle n\rangle_p}\,(1-z)^{\lambda-2} + O(1-z)$. (10)

Therefore, the DT2 delay probability behaves as $r_b \sim b^{1-\lambda}$
for large $b$. In other words, the DT2 delay distribution decays more
slowly than that of incoming packets, $p_n$, and the average delays
$\langle\tau\rangle_w$ and $\langle b\rangle_r$ become infinite even
when $\langle n\rangle_p < 1$.

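On the other hand, in real finite scale-free networks such as the
Internet with the degree exponent $\gamma$, $p_n$ at the hub has a
natural cut-off at $n \sim k_{\max} \sim N^{1/(\gamma-1)}$, in which
case we have $\langle n^2\rangle_p \sim k_{\max}^{3-\lambda}$. Thus from
Eq. (8) the average delay time at the hub scales as

    $\langle\tau\rangle_w \sim N^{(3-\lambda)/(\gamma-1)}$ (11)

for $2 < \lambda < 3$.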
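The cut-off scaling can be checked numerically by evaluating Eq. (8) for a truncated power law and estimating the growth exponent of the delay with the cut-off. The sketch below is illustrative only: the function name, the prefactor `c = 0.2` (chosen so that $\langle n\rangle_p < 1$), and the value $\lambda = 2.5$ are assumptions made for this example.

```python
import math

def mean_delay_with_cutoff(lam, n_max, c=0.2):
    """Eq. (8) for p_n = c * n**(-lam), 1 <= n <= n_max; p_0 takes the remainder."""
    p_sum = mean_n = mean_n2 = 0.0
    for n in range(1, n_max + 1):
        pn = c * n ** (-lam)
        p_sum += pn
        mean_n += n * pn
        mean_n2 += n * n * pn
    p0 = 1.0 - p_sum
    if p0 <= 0 or mean_n >= 1:
        raise ValueError("choose c so that p_0 > 0 and <n> < 1")
    return (mean_n - 1 + p0) / 2 + (mean_n2 - mean_n) / (2 * (1 - mean_n))

lam = 2.5
t_small = mean_delay_with_cutoff(lam, 10**3)
t_large = mean_delay_with_cutoff(lam, 10**5)
# effective exponent of <tau>_w versus the cut-off n_max
slope = math.log(t_large / t_small) / math.log(10**5 / 10**3)
```

The measured slope approaches $3 - \lambda = 0.5$ as the cut-off grows. Substituting $k_{\max} \sim N^{1/(\gamma-1)}$ for the cut-off then gives the exponent $(3-\lambda)/(\gamma-1)$ with respect to $N$.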

In the real-world Internet, the bandwidth of each AS is not uniform.
Nodes with high bandwidth are located at the core of the network,
forming a rich club [34, 35]; however, their degrees are small. In
contrast, nodes with large degree are located at the periphery of the
network with low bandwidth [36]. Therefore, our analysis of the average
delay time has to be generalized to incorporate the inhomogeneous
bandwidths and arrival rates [37].

In summary, in the first part of this Letter, we have reviewed previous
studies of the topological properties of the Internet and introduced a
minimal model, the adaptation model, to reproduce those properties. We
then studied the transport of data packets travelling along the shortest
pathways from source to destination nodes in terms of the load. In the
second part, we studied the delivery process of data packets in the
queueing system, in particular, when the number of arrival channels is
broadly distributed, following the scale-free degree distribution.